0 Introduction

This R Markdown script contains all the code used for outlier detection, data analysis, and plotting, including the additional statistical analyses and all statistical models with their summaries.

1 Experiment 1

1.1 Participant exclusions

The participant filtering process of Experiment 1 is detailed in the code below.

# Loading the data to filter out participants
dataExp1 <- read.delim("./dataAnonNotFilteredExp1_added.txt", sep ="\t", header = TRUE, encoding="UTF-8")

# Remove 11 suspicious participants
rmParticipant1Exp1 <- c("5970a90e","6e5ccd20","9e75832a","86687167","64afd2cc","429604bc","409f2599","d6390ddc","7ab6a320","bd90455b","39b29c2b")
dataExp1 <- dataExp1[!dataExp1$workerId %in% rmParticipant1Exp1,]

# Calculate dprime
# Get the hit rate: 4 or 5 to real words
dataExp1$accReal <- ifelse(dataExp1$type=="real" & dataExp1$enteredResponse %in% c(4,5),1,0)
hrateT <- aggregate(accReal ~ workerId, sum, data=dataExp1)
hrateT$hit_rate <- round(hrateT$accReal/132,3)

# Manually correct the hit rate of one participant (who completed one fewer item)
#length(which(dataExp1$type=="real" & dataExp1$workerId=="c0999383"))
hrateT[hrateT$workerId=="c0999383",]$hit_rate <- round(hrateT[hrateT$workerId=="c0999383",]$accReal/131,3)

# Get the FA (false alarm) rate: 4-5 to nonwords
dataExp1$accNon <- ifelse(dataExp1$type=="non" & dataExp1$enteredResponse %in% c(4,5),1,0)
farateT <- aggregate(accNon ~ workerId, sum, data=dataExp1)
farateT$fa_rate <- round(farateT$accNon/209,3)

# Manually correct the FA rate of one participant (who completed two fewer items)
#length(which(dataExp1$type=="non" & dataExp1$workerId=="f06d0f73"))
farateT[farateT$workerId=="f06d0f73",]$fa_rate <- round(farateT[farateT$workerId=="f06d0f73",]$accNon/207,3)
dprime <- merge(hrateT, farateT, by="workerId")
dprime$dprime <- round(qnorm(dprime$hit_rate) - qnorm(dprime$fa_rate),3)
dprime <- dprime[c("workerId","dprime")]
drop <- c("accReal","accNon")
dataExp1 <- dataExp1[,!(names(dataExp1) %in% drop)]
dataExp1 <- merge(dataExp1, dprime, by="workerId")
rm(dprime, hrateT, farateT)

# Remove 4 participants whose d-prime value is lower than 0
rmParticipant2Exp1 <- unique(dataExp1[dataExp1$dprime < 0,]$workerId) 
# a29fe31d e1f518e6 efdd3439 f57a3633 
dataExp1 <- dataExp1[!dataExp1$workerId %in% rmParticipant2Exp1,]           

# Remove one native speaker of Mandarin Chinese
dataExp1 <- dataExp1[!dataExp1$firstLang=="Mandarin",]

# Remove 11 participants whose speakMaori or compMaori is equal to or above 3
rmParticipant3Exp1 <- unique(dataExp1[dataExp1$speakMaori >= 3 | dataExp1$compMaori >= 3,]$workerId)
# 007a1752 170ce007 20c10896 3128bb29 66d0a920 75caca3b b00cd565 d3cd7085 de95cdaf eef4d9c0 fab8a51f
dataExp1 <- dataExp1[!dataExp1$workerId %in% rmParticipant3Exp1,]

# Remove one participant who did not learn English in NZ and has been living overseas for more than two years
summaryExp1WorkerId <- unique(dataExp1[,c("workerId","firstLangCountry","place","duration")])
EngNotInNZExp1 <- summaryExp1WorkerId[!summaryExp1WorkerId$firstLangCountry=="NZ",]
rmParticipant4Exp1 <- unique(EngNotInNZExp1[EngNotInNZExp1$place=="overseas",]$workerId) 
# 880242c2
dataExp1 <- dataExp1[!dataExp1$workerId %in% rmParticipant4Exp1,]

# Detect participants whose median reactionTime is more than 2 SD below the mean of all participants' medians
median_RT <- aggregate(dataExp1$reactionTime, by=list(dataExp1$workerId), median)
names(median_RT) <- c("workerId","median")
cut <- mean(median_RT$median)-2*sd(median_RT$median)
# median_RT[!median_RT$median > cut,]$workerId # None detected!

# Check the total number of usable participants for Exp1
# length(unique(dataExp1$workerId)) # 101

# Add a column to indicate the presence of macrons
dataExp1$macron <- FALSE
dataExp1[grepl("ā|ē|ī|ō|ū",dataExp1$word),]$macron <- TRUE
dataExp1$macron <- as.factor(dataExp1$macron)
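
The d-prime step above subtracts the z-transformed false-alarm rate from the z-transformed hit rate. A minimal sketch of that arithmetic follows, using a hypothetical helper `dprime_from_counts` and made-up counts (neither is part of the original script); only the item totals of 132 real words and 209 nonwords come from the code above. It omits the rounding applied in the script, and it shows why a hit rate of exactly 1 gives an infinite d-prime.

```r
# Hypothetical helper (not in the original script): d' from raw counts.
dprime_from_counts <- function(hits, n_real, false_alarms, n_non) {
  hit_rate <- hits / n_real
  fa_rate <- false_alarms / n_non
  # d' = z(hit rate) - z(false-alarm rate)
  qnorm(hit_rate) - qnorm(fa_rate)
}

# Made-up counts for illustration only
d_typical <- dprime_from_counts(hits = 99, n_real = 132,
                                false_alarms = 52, n_non = 209)

# A participant who rates every real word 4 or 5 has a hit rate of 1,
# and qnorm(1) is Inf, so d' is infinite
d_perfect <- dprime_from_counts(hits = 132, n_real = 132,
                                false_alarms = 52, n_non = 209)

is.finite(d_typical)
is.finite(d_perfect)
```

Participants with an infinite value of this kind are the ones set aside by the `is.finite(dataExp1$dprime)` filter before the d-prime model is fitted in section 1.6.2.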

1.2 Dataset structure

The data is structured as follows:

  • workerId is the unique ID for each participant.
  • enteredResponse is the confidence rating for each stimulus.
  • reactionTime is the reaction time for each rating (seconds).
  • type is the classification of each stimulus: nonword (‘non’) or word (‘real’).
  • length is the phoneme length of each stimulus.
  • word is the stimulus used for the rating.
  • speakMaori is each participant’s report of how well they can speak Māori (on a scale from 0 to 5).
  • compMaori is each participant’s report of how well they can understand/read Māori (on a scale from 0 to 5).
  • maoriProf is the sum of quantified response for speakMaori and compMaori (participant Māori proficiency).
  • age is the age group for each participant.
  • gender is the gender reported by each participant.
  • ethnicity is coded as a binary variable, either Māori (M) or non-Māori (non M).
  • education is each participant’s highest level of education.
  • children is each participant’s report of whether they have had any children who have attended preschool or primary school in New Zealand in the past five years.
  • maoriList is each participant’s basic knowledge of Māori (with a scale ranging from 0 to 9).
  • place is each participant’s current place of living (3 levels: NZ North Island, NZ South Island, or Overseas).
  • duration is each participant’s time living in their current place (2 levels: long is > 2 years; short is <= 2 years).
  • firstLang is each participant’s first language.
  • firstLangCountry is the country where each participant learned their first language.
  • anyOtherLangs is any other languages each participant reports speaking.
  • hawaii is the binary response to the question whether participants have lived in Hawaii.
  • anyPolynesian is the binary response to the question whether participants know any Polynesian languages such as Hawaiian, Tahitian, Sāmoan, or Tongan.
  • whichPolynesian is the information regarding participants’ knowledge of Polynesian languages, if they know any.
  • impairments is the answer to the question whether participants have a history of any speech or language impairments.
  • maoriExpo is each participant’s level of exposure to Māori (with a scale ranging from 0 to 10).
  • score is the phonotactic score based on the 1,629 most frequent morph types derived from all words in the dictionary, normalized by the phonemic length of stimuli, ignoring vowel length distinctions, assuming that participants are attempting to parse stimuli into morphs.
  • n.neighbors is the number of words (from the Māori dictionary) that can be reached by adding, deleting, or substituting one phoneme in each stimulus.
  • mean.neighbor.logfreq is the frequency-weighted phonological neighbourhood density.
  • dprime is the measure of sensitivity for each participant’s performance.
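
The definition of n.neighbors above corresponds to counting dictionary entries at an edit distance of exactly one from the stimulus. A minimal sketch of that idea follows, assuming each phoneme is written as a single character; the helper `count_neighbors` and the toy lexicon are hypothetical and are not the procedure or dictionary used to compute the column in the dataset.

```r
# Hypothetical helper: count lexicon entries reachable from a stimulus by
# one addition, deletion, or substitution (Levenshtein distance of 1).
count_neighbors <- function(stimulus, lexicon) {
  d <- utils::adist(stimulus, lexicon)  # 1 x length(lexicon) distance matrix
  sum(d == 1)
}

# Toy strings for illustration only (assumption: one character per phoneme)
toy_lexicon <- c("kata", "mata", "kati", "kat", "katoa", "moana")

n_kata <- count_neighbors("kata", toy_lexicon)
n_kata  # "mata", "kati", "kat" and "katoa" are each one edit away
```

For real Māori stimuli, digraphs such as "wh" and "ng" would first have to be mapped to single symbols so that one character stands for one phoneme.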

1.3 Overview of participants’ sociolinguistic profile in Experiment 1

Overview of participants' sociolinguistic profile in Experiment 1. Bars are labeled with their counts for each category.

1.4 Length distribution of real word stimuli

Length distribution of real word stimuli. The length of stimulus (the number of phonemes) is represented on the x-axis and the number of stimuli is represented on the y-axis.

1.5 Average rating per word

Average word ratings by phonotactic score. The average rating per word is represented on the y-axis and the phonotactic score is represented on the x-axis. Overlapping labels are omitted.

1.6 Statistical analyses

1.6.1 Comparing the AIC scores of four statistical models with different measures for evaluating participants

Comparison of the AIC score (Experiment 1)

| Evaluation measure | AIC |
|---|---|
| Dprime | 65998.9 |
| Basic knowledge | 68789.2 |
| Proficiency | 70810.7 |
| Exposure | 70954.1 |
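
As a reminder of how such a comparison is read, a lower AIC indicates a better trade-off between fit and number of parameters. The sketch below is an illustration only: it swaps in base-R logistic regressions on the built-in `mtcars` data in place of the ordinal mixed-effects models, and the object names are hypothetical.

```r
# Illustration only: two base-R logistic regressions with different predictors
m_wt <- glm(am ~ wt, data = mtcars, family = binomial)
m_hp <- glm(am ~ hp, data = mtcars, family = binomial)

# Collect the AIC scores and sort so the preferred (lowest) model comes first
aic_table <- data.frame(model = c("wt", "hp"),
                        AIC = c(AIC(m_wt), AIC(m_hp)))
aic_table <- aic_table[order(aic_table$AIC), ]

# AIC by hand: 2 * (number of parameters) - 2 * log-likelihood
k <- attr(logLik(m_wt), "df")
manual_aic <- 2 * k - 2 * as.numeric(logLik(m_wt))
all.equal(manual_aic, AIC(m_wt))
```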

1.6.2 Modeling confidence ratings with participants’ d’ by an ordinal mixed-effects model

The model fitting procedure is described in the code below.

# Convert confidence ratings to a factor (discrete variable)
dataExp1$enteredResponse <- as.factor(dataExp1$enteredResponse)

# Discard participants with infinite dprime
dataExp1DprimeFinite <- dataExp1[is.finite(dataExp1$dprime),]

# Model fitting: fixed-effects & mixed-effects
# m1 <- clm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*type*c.(dprime) + c.(score)*type*c.(dprime), data=dataExp1DprimeFinite)
# anova(m1, type="III") # remove type:c.(dprime):c.(length) 
# m2 <- clm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*type +  c.(length)*c.(dprime) + c.(score)*type*c.(dprime), data=dataExp1DprimeFinite)
# anova(m2, type="III") # remove c.(score):type:c.(dprime) 
# m3 <- clm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*type + c.(length)*c.(dprime) + c.(score)*type + c.(score)*c.(dprime), data=dataExp1DprimeFinite)
# anova(m3, type="III") # remove c.(dprime):c.(score)
# m4 <- clm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*type + c.(length)*c.(dprime) + c.(score)*type, data=dataExp1DprimeFinite)
# anova(m4, type="III") # remove type:c.(length)
# m5 <- clm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score)*type, data=dataExp1DprimeFinite)
# anova(m5, type="III")
# m6 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score)*type + (1 + macron*type + c.(n.neighbors)*type + c.(length)|workerId) + (1 + c.(score)*type|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge -> remove random slopes by workerId for all interactions 
# m7 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(length)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge -> remove a random slope (length) by workerId
# m8 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# summary(m8) # model converged -> remove type:dprime:n.neighbors
# m9 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*c.(dprime) + type*c.(n.neighbors) + c.(length)*c.(dprime) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m8, m9) # significant, choose m8 (do not remove type:dprime:n.neighbors) 
# summary(m8) # remove type:score
# m10 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + macron + type + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge -> remove random slopes (length and macron) by workerId
# m11 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + type + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge -> remove random slopes (length, macron and type) by workerId
# m12 <- clmm(enteredResponse ~ macron*type*c.(dprime) + c.(n.neighbors)*type*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# summary(m12) # model converged -> remove macron:type:dprime
# m13 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m12, m13) # not significant, choose m13 (remove macron:type:dprime) 
# summary(m13) # remove type:macron
# m14 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m13, m14) -> significant, choose m13 (do not remove type:macron) 
# summary(m13) # remove n.neighbors:type:dprime
# m15 <- clmm(enteredResponse ~ c.(n.neighbors)*type + type*c.(dprime) + c.(n.neighbors)*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m13, m15) # significant, choose m13 (do not remove n.neighbors:type:dprime)
# summary(m13) # remove dprime:length
# m16 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m13, m16) # significant, choose m13 (do not remove dprime:length) 
# summary(m13) # remove dprime:macron
# m17 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# anova(m13, m17) # significant, choose m13 (do not remove dprime:macron) 
# summary(m13) # add random slopes (type and macron) by workerId to m13
# m13_1 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score) + type + macron|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge -> add a random slope (type) by workerId to m13
# m13_2 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score) + type|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# summary(m13_2) # AIC: 65998.93, add a random slope (macron) by workerId to m13
# m13_3 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# summary(m13_3) # AIC: 66330.89
# anova(m13_2, m13_3) # choose m13_2 (model with a lower AIC score) -> add an uncorrelated random effect (type) by workerId to m13_2
# m13_4 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(dprime) + macron*type + macron*c.(dprime) + c.(length)*c.(dprime) + c.(score) + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (0 + type|workerId) + (1+ c.(dprime)|word), data=dataExp1DprimeFinite)
# Model didn't converge, stay with m13_2
# saveRDS(m13_2, file = "exp1Dprime.rds")
mExp1Dprime <- readRDS("exp1Dprime.rds")
clm_table(mExp1Dprime, caption="Model summary of confidence ratings with participants' d'. All numeric variables in this model are centered.")
Model summary of confidence ratings with participants’ d’. All numeric variables in this model are centered.
| Parameter | Estimate | Std. Error | \(z\) | \(p\) |
|---|---|---|---|---|
| Effects | | | | |
| n.neighbors (centered) | 0.054 | 0.021 | 2.549 | 0.011 * |
| type = real | 4.128 | 0.214 | 19.323 | <0.001 *** |
| dprime (centered) | -0.138 | 0.175 | -0.788 | 0.431 |
| macron = TRUE | 0.775 | 0.199 | 3.906 | <0.001 *** |
| length (centered) | 0.085 | 0.049 | 1.730 | 0.084 . |
| score (centered) | 2.536 | 0.832 | 3.048 | 0.002 ** |
| n.neighbors (centered) × type = real | -0.021 | 0.022 | -0.941 | 0.347 |
| n.neighbors (centered) × dprime (centered) | 0.030 | 0.011 | 2.597 | 0.009 ** |
| type = real × dprime (centered) | 1.829 | 0.175 | 10.476 | <0.001 *** |
| dprime (centered) × macron = TRUE | -0.558 | 0.081 | -6.901 | <0.001 *** |
| type = real × macron = TRUE | -0.350 | 0.346 | -1.012 | 0.312 |
| dprime (centered) × length (centered) | -0.048 | 0.024 | -2.026 | 0.043 * |
| n.neighbors (centered) × type = real × dprime (centered) | -0.029 | 0.011 | -2.571 | 0.010 * |
| Thresholds | | | | |
| 1\|2 | -3.147 | 0.159 | | |
| 2\|3 | -0.825 | 0.157 | | |
| 3\|4 | 1.357 | 0.157 | | |
| 4\|5 | 2.821 | 0.158 | | |
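
The model formulas wrap every numeric predictor in `c.()`, and the table caption states that all numeric variables are centered. The definition of `c.()` is not shown in this section, so the sketch below is an assumption: it treats `c.()` as plain mean-centering without rescaling, and the example vector is made up.

```r
# Assumed definition (the original helper is defined elsewhere):
# subtract the mean, do not divide by the standard deviation.
c. <- function(x) as.numeric(scale(x, center = TRUE, scale = FALSE))

# Made-up stimulus lengths for illustration
stim_length <- c(3, 4, 5, 8)
centered <- c.(stim_length)

centered        # deviations from the mean length
mean(centered)  # zero, up to floating-point error
```

Under this assumption, a coefficient for one term of an interaction describes its effect when the other centered predictors are at their sample means.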

1.6.3 Effect plots from the model with participants’ d’

Effect plots of: the interaction between the neighbourhood density, the distinction between non vs. real word stimuli, and participant d' (Fig.a); the interaction between the stimulus length and participant d' (Fig.b); the interaction between the presence of macrons and participant d' (Fig.c); and the phonotactic score (Fig.d). For plots involving interactions with participant d', the upper panel represents participants with high d', and the lower panel represents participants with low d'. Plots on the left show predicted mean ratings and plots on the right show predicted distributions over ratings.

1.6.4 Modeling confidence ratings with participants’ self-rated Māori proficiency by an ordinal mixed-effects model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects and mixed-effects
# m1 <- clm(enteredResponse ~ macron*type*c.(maoriProf) + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*type*c.(maoriProf) + c.(score)*type*c.(maoriProf), data=dataExp1)
# anova(m1, type="III") # remove macron:type:c.(maoriProf)
# m2 <- clm(enteredResponse ~ macron*c.(maoriProf) + macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*type*c.(maoriProf) + c.(score)*type*c.(maoriProf), data=dataExp1)
# anova(m2, type="III") # remove c.(maoriProf):type:c.(score)
# m3 <- clm(enteredResponse ~ macron*c.(maoriProf) + macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*type*c.(maoriProf) + c.(score)*type + c.(score)*c.(maoriProf), data=dataExp1)
# anova(m3, type="III") # remove c.(maoriProf):type:c.(length) 
# m4 <- clm(enteredResponse ~ macron*c.(maoriProf) + macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*type + c.(length)*c.(maoriProf) + c.(score)*type + c.(score)*c.(maoriProf), data=dataExp1)
# anova(m4, type="III") # remove type:c.(length) 
# m5 <- clm(enteredResponse ~ macron*c.(maoriProf) + macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*c.(maoriProf) + c.(score)*type + c.(score)*c.(maoriProf), data=dataExp1)
# anova(m5, type="III") # remove macron:c.(maoriProf) 
# m6 <- clm(enteredResponse ~ macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length)*c.(maoriProf) + c.(score)*type + c.(score)*c.(maoriProf), data=dataExp1)
# anova(m6, type="III") # remove c.(maoriProf):c.(length)
# m7 <- clm(enteredResponse ~ macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length) + c.(score)*type + c.(score)*c.(maoriProf), data=dataExp1)
# anova(m7, type="III") # remove c.(maoriProf):c.(score)
# m8 <- clm(enteredResponse ~ macron*type + c.(n.neighbors)*type*c.(maoriProf) + c.(length) + c.(score)*type, data=dataExp1)
# anova(m8, type="III")
# m9 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron*type + c.(score)*type + c.(length) + (1 + c.(n.neighbors)*type + macron*type|workerId) + (1 + c.(score)*type + c.(length)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> remove random slopes by workerId for all interactions
# m10 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron*type + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + type + macron + c.(score)|workerId) + (1 + c.(length)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> remove a random slope (length) by workerId
# m11 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron*type + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + type + macron + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (macron and length) by workerId
# m12 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron*type + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score) + type|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (macron, length and type) by workerId
# m13 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron*type + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# summary(m13) # model converged -> remove type:macron
# m14 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# anova(m13, m14) # not significant, choose m14 (remove type:macron) 
# summary(m14) # remove length
# m15 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# anova(m14, m15) # not significant, choose m15 (remove length) 
# summary(m15) # (AIC: 71237.05) remove n.neighbors:type:maoriProf
# m16 <- clmm(enteredResponse ~ c.(n.neighbors)*type + type*c.(maoriProf) + c.(n.neighbors)*c.(maoriProf) + c.(score)*type + macron + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# anova(m15, m16) # significant, choose m15 (do not remove n.neighbors:type:maoriProf) 
# summary(m15) # remove type:score
# m17 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# anova(m15, m17) # significant, choose m15 (do not remove type:score) 
# summary(m15) # remove macron
# m18 <- clmm(enteredResponse ~ c.(n.neighbors)*type + type*c.(maoriProf) + c.(n.neighbors)*c.(maoriProf) + c.(score)*type + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# anova(m15, m18) # significant, choose m15 (do not remove macron) -> add random slopes (type and macron) by workerId to m15
# m15_1 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + (1 + c.(n.neighbors) + c.(score) + macron + type|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> add a random slope (type) by workerId to m15
# m15_2 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + (1 + c.(n.neighbors) + c.(score) + type|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge -> add a random slope (macron) by workerId to m15
# m15_3 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# summary(m15_3) # model converged (AIC: 70810.69) 
# anova(m15, m15_3) # choose m15_3 (model with a lower AIC score) -> add an uncorrelated random effect (type) by workerId to m15_3
# m15_4 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriProf) + macron + c.(score)*type + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (0 + type|workerId) + (1 + c.(maoriProf)|word), data=dataExp1)
# Model didn't converge, stay with m15_3
# saveRDS(m15_3, file = "exp1MaoriProf.rds")
mExp1MaoriProf <- readRDS("./exp1MaoriProf.rds")
clm_table(mExp1MaoriProf, caption="Model summary of confidence ratings with participants’ self-rated Māori proficiency. All numeric variables in this model are centered.")
Model summary of confidence ratings with participants’ self-rated Māori proficiency. All numeric variables in this model are centered.
| Parameter | Estimate | Std. Error | \(z\) | \(p\) |
|---|---|---|---|---|
| Effects | | | | |
| n.neighbors (centered) | 0.031 | 0.016 | 1.928 | 0.054 . |
| type = real | 4.323 | 0.156 | 27.685 | <0.001 *** |
| maoriProf (centered) | 0.041 | 0.114 | 0.358 | 0.720 |
| macron = TRUE | 0.661 | 0.171 | 3.869 | <0.001 *** |
| score (centered) | 4.147 | 0.954 | 4.345 | <0.001 *** |
| n.neighbors (centered) × type = real | -0.010 | 0.020 | -0.488 | 0.625 |
| n.neighbors (centered) × maoriProf (centered) | 0.014 | 0.005 | 2.601 | 0.009 ** |
| type = real × maoriProf (centered) | 0.476 | 0.043 | 11.105 | <0.001 *** |
| type = real × score (centered) | -2.929 | 1.458 | -2.010 | 0.044 * |
| n.neighbors (centered) × type = real × maoriProf (centered) | -0.015 | 0.005 | -2.824 | 0.005 ** |
| Thresholds | | | | |
| 1\|2 | -3.024 | 0.145 | | |
| 2\|3 | -0.796 | 0.143 | | |
| 3\|4 | 1.404 | 0.143 | | |
| 4\|5 | 2.803 | 0.144 | | |
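
The thresholds and effects in this summary can be turned into predicted rating probabilities. The sketch below assumes a logit link (the default in the ordinal package; the link is not stated in this section), sets all random effects to zero, and holds the centered predictors at their means with macron = FALSE. The helper `rating_probs` is hypothetical; the numbers are copied from the summary above.

```r
# Threshold estimates copied from the proficiency model summary
thresholds <- c("1|2" = -3.024, "2|3" = -0.796, "3|4" = 1.404, "4|5" = 2.803)

# Hypothetical helper: P(rating = j) from a cumulative logit model,
# where P(rating <= j) = plogis(threshold_j - eta)
rating_probs <- function(eta, thresholds) {
  cum <- c(plogis(thresholds - eta), 1)  # cumulative probabilities
  probs <- diff(c(0, cum))               # per-category probabilities
  names(probs) <- 1:5
  probs
}

# Nonword at the reference levels: linear predictor is 0
p_non <- rating_probs(0, thresholds)

# Real word at the same settings: linear predictor is the 'type = real' effect
p_real <- rating_probs(4.323, thresholds)

round(p_non, 3)
round(p_real, 3)
```

Under these assumptions the most probable rating is the middle of the scale for nonwords and the top of the scale for real words, which is what the large positive type = real effect implies.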

1.6.5 Effect plots from the model with participants’ self-rated Māori proficiency

Effect plots of: the interaction between the neighbourhood density, self-rated Māori proficiency and the distinction between non vs. real word stimuli (Fig.a); the interaction between the phonotactic score and the distinction between non vs. real word stimuli (Fig.b); and the presence of macrons (Fig.c). For plots involving interactions with participants' self-rated Māori proficiency, the upper panel represents participants with low proficiency, and the lower panel represents participants with high proficiency. Plots on the left show predicted mean ratings and plots on the right show predicted distributions over ratings.

1.6.6 Modeling confidence ratings with participants’ self-rated exposure to Māori by an ordinal mixed-effects model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects and mixed-effects
# m1 <- clm(enteredResponse ~ macron*type*c.(maoriExpo) + c.(n.neighbors)*type*c.(maoriExpo) + c.(length)*type*c.(maoriExpo) + c.(score)*type*c.(maoriExpo), data=dataExp1)
# anova(m1, type="III") # remove type:c.(maoriExpo):c.(n.neighbors)
# m2 <- clm(enteredResponse ~ macron*type*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*type*c.(maoriExpo) + c.(score)*type*c.(maoriExpo), data=dataExp1)
# anova(m2, type="III") # remove type:c.(maoriExpo):c.(score)
# m3 <- clm(enteredResponse ~ macron*type*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*type*c.(maoriExpo) + c.(score)*type + c.(score)*c.(maoriExpo), data=dataExp1)
# anova(m3, type="III") # remove type:c.(maoriExpo):c.(length)
# m4 <- clm(enteredResponse ~ macron*type*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*type + c.(length)*c.(maoriExpo) + c.(score)*type + c.(score)*c.(maoriExpo), data=dataExp1)
# anova(m4, type="III") # remove macron:type:c.(maoriExpo)
# m5 <- clm(enteredResponse ~ macron*type + macron*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*type + c.(length)*c.(maoriExpo) + c.(score)*type + c.(score)*c.(maoriExpo), data=dataExp1)
# anova(m5, type="III") # remove type:c.(length) 
# m6 <- clm(enteredResponse ~ macron*type + macron*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*c.(maoriExpo) + c.(score)*type + c.(score)*c.(maoriExpo), data=dataExp1)
# anova(m6, type="III") # remove c.(maoriExpo):c.(score) 
# m7 <- clm(enteredResponse ~ macron*type + macron*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length)*c.(maoriExpo) + c.(score)*type, data=dataExp1)
# anova(m7, type="III") # remove c.(maoriExpo):c.(length)
# m8 <- clm(enteredResponse ~ macron*type + macron*c.(maoriExpo) + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type, data=dataExp1)
# anova(m8, type="III") # remove macron:c.(maoriExpo) 
# m9 <- clm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type, data=dataExp1)
# anova(m9, type="III")
# m10 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type + (1 + macron*type + c.(n.neighbors)*type + c.(length)|workerId) + (1 + c.(score)*type|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> remove random slopes by workerId for all interactions
# m11 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(length)|workerId) + (1 + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> remove a random slope (length) by workerId
# m12 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (length and macron) by workerId
# m13 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type + (1 + type + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (length, macron, and type) by workerId
# m14 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriExpo) + c.(length) + c.(score)*type + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# summary(m14) # Model converged -> remove n.neighbors:maoriExpo
# m15 <- clmm(enteredResponse ~ macron*type + c.(n.neighbors)*type + c.(score)*type + c.(maoriExpo) + c.(length) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m14, m15) # not significant, choose m15 (remove n.neighbors:maoriExpo) 
# summary(m15) # remove macron:type
# m16 <- clmm(enteredResponse ~ macron + c.(n.neighbors)*type + c.(score)*type + c.(maoriExpo) + c.(length) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m15, m16) # not significant, choose m16 (remove macron:type) 
# summary(m16) # remove n.neighbors:type
# m17 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score)*type + c.(maoriExpo) + c.(length) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m16, m17) # not significant, choose m17 (remove n.neighbors:type) 
# summary(m17) # remove length
# m18 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score)*type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m17, m18) # not significant, choose m18 (remove length)  
# summary(m18) # remove score:type
# m19 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score) + type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m18, m19) # not significant, choose m19 (remove score:type) 
# summary(m19) # AIC: 71396.90, remove n.neighbors
# m20 <- clmm(enteredResponse ~ macron + c.(score) + type + c.(maoriExpo) + (1 + c.(score)|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m19, m20) # significant, choose m19 (do not remove n.neighbors) -> add random slopes (type and macron) by workerId to m19
# m19_1 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score) + type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score) + type + macron|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> add a random slope (type) by workerId to m19
# m19_2 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score) + type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score) + type|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge -> add a random slope (macron) by workerId to m19
# m19_3 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score) + type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m19, m19_3) #choose m19_3 (model with a lower AIC score)
# summary(m19_3) # AIC: 70954.15, remove n.neighbors
# m20 <- clmm(enteredResponse ~ macron + c.(score) + type + c.(maoriExpo) + (1 + c.(score) + macron|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# anova(m19_3, m20) # significant, choose m19_3 (do not remove n.neighbors) -> add an uncorrelated random effect (type) by workerId to m19_3
# m19_4 <- clmm(enteredResponse ~ macron + c.(n.neighbors) + c.(score) + type + c.(maoriExpo) + (1 + c.(n.neighbors) + c.(score) + macron|workerId) + (0 + type|workerId) + (1 + c.(maoriExpo)|word), data=dataExp1)
# Model didn't converge, stay with m19_3
# saveRDS(m19_3, file = "exp1MaoriExpo.rds")
mExp1MaoriExpo <- readRDS("./exp1MaoriExpo.rds") 
clm_table(mExp1MaoriExpo, caption="Model summary of confidence ratings with participants' self-rated exposure to Māori. All numeric variables in this model are centered.")
Model summary of confidence ratings with participants’ self-rated exposure to Māori. All numeric variables in this model are centered.
Parameter                 Estimate  Std. Error  \(z\)   \(p\)
Effects
  macron = TRUE           0.727     0.171       4.253   <0.001 ***
  n.neighbors (centered)  0.020     0.011       1.871   0.061 .
  score (centered)        2.657     0.791       3.358   <0.001 ***
  type = real             4.422     0.171       25.923  <0.001 ***
  maoriExpo (centered)    0.098     0.047       2.095   0.036 *
Thresholds
  1|2                     -2.886    0.141
  2|3                     -0.658    0.140
  3|4                     1.546     0.140
  4|5                     2.941     0.141
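As a rough illustration of how the estimates in the table above map onto ratings, the sketch below converts the thresholds and the type = real coefficient into predicted rating probabilities. It assumes the default logit link of clmm, i.e. logit(P(rating <= j)) = threshold_j - eta. It also sets all random effects to zero and holds every other predictor at its reference level or mean. The helper name rating_probs is hypothetical and not part of the original script.

```r
# Thresholds and one fixed effect copied from the model summary above.
thresholds <- c("1|2" = -2.886, "2|3" = -0.658, "3|4" = 1.546, "4|5" = 2.941)
beta_real  <- 4.422  # coefficient for type = real

# Hypothetical helper: rating probabilities for a linear predictor eta,
# assuming a cumulative logit model with random effects set to zero.
rating_probs <- function(eta, theta = thresholds) {
  cum   <- c(plogis(theta - eta), 1)  # P(rating <= j) for j = 1..5
  probs <- diff(c(0, cum))            # P(rating == j)
  names(probs) <- as.character(1:5)
  probs
}

p_non  <- rating_probs(0)          # nonword, no macron, centered covariates at 0
p_real <- rating_probs(beta_real)  # real word, otherwise identical

round(rbind(nonword = p_non, real = p_real), 3)
```

Under these assumptions the most likely rating is 3 for a nonword and 5 for a real word, which matches the direction of the large type effect.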

1.6.7 Effect plots from the model with participants’ self-rated exposure to Māori

Effect plots of: phonotactic score (Fig.a); the presence of macrons (Fig.b); the distinction between non vs. real word stimuli (Fig.c); participants’ self-rated exposure to Māori (Fig.d); and neighbourhood density (Fig.e). Plots on the left show predicted mean ratings and plots on the right show predicted distributions over ratings.

1.6.8 Modeling confidence ratings with participants’ self-rated basic knowledge of Māori by an ordinal mixed-effects model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects and mixed-effects
# m1 <- clm(enteredResponse ~ macron*type*c.(maoriList) + c.(n.neighbors)*type*c.(maoriList) + c.(length)*type*c.(maoriList) + c.(score)*type*c.(maoriList), data=dataExp1)
# anova(m1, type="III") # remove type:c.(maoriList):c.(score)
# m2 <- clm(enteredResponse ~ macron*type*c.(maoriList) + c.(n.neighbors)*type*c.(maoriList) + c.(length)*type*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type, data=dataExp1)
# anova(m2, type="III") # remove macron:type:c.(maoriList) 
# m3 <- clm(enteredResponse ~ macron*c.(maoriList) + macron*type + c.(n.neighbors)*type*c.(maoriList) + c.(length)*type*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type, data=dataExp1)
# anova(m3, type="III") # remove c.(maoriList):type:c.(length) 
# m4 <- clm(enteredResponse ~ macron*c.(maoriList) + macron*type + c.(n.neighbors)*type*c.(maoriList) + c.(length)*type + c.(length)*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type, data=dataExp1)
# anova(m4, type="III") # remove type:c.(length)  
# m5 <- clm(enteredResponse ~ macron*c.(maoriList) + macron*type + c.(n.neighbors)*type*c.(maoriList) + c.(length)*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type, data=dataExp1)
# anova(m5, type="III") # remove c.(maoriList):c.(length)  
# m6 <- clm(enteredResponse ~ macron*c.(maoriList) + macron*type + c.(n.neighbors)*type*c.(maoriList) + c.(length) + c.(score)*c.(maoriList) + c.(score)*type, data=dataExp1)
# anova(m6, type="III")
# m7 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + macron*type + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + macron*type + c.(n.neighbors)*type| workerId) + (1 + c.(score)*type + c.(length)|workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# Model didn't converge -> remove random slopes by workerId for all interactions
# m8 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + macron*type + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + macron + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(length)|workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# Model didn't converge -> remove a random slope (length) by workerId
# m9 <- clmm(enteredResponse ~ macron*c.(maoriList) + macron*type + c.(n.neighbors)*type*c.(maoriList) + c.(length) + c.(score)*c.(maoriList) + c.(score)*type + (1 + macron + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (length and macron) by workerId
# m10 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + macron*type + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m10) # remove type:macron
# m11 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# Model didn't converge -> remove random slopes (length, macron and type) by workerId
# m12 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m12) -> model converged -> rerun m10 after removing random slopes (type, length and macron) by workerId
# m10_1 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + macron*type + c.(score)*c.(maoriList) + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m10_1, m12) -> not significant, choose m12 (remove type:macron) 
# summary(m12) # remove maoriList:score
# m13 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m12, m13) # not significant, choose m13 (remove maoriList:score) 
# summary(m13) # remove n.neighbors:type:maoriList
# m14 <- clmm(enteredResponse ~ c.(n.neighbors)*type + c.(n.neighbors)*c.(maoriList) + type*c.(maoriList) + macron*c.(maoriList) + c.(score)*type + c.(length) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m13, m14) # significant, choose m13 (do not remove n.neighbors:type:maoriList) 
# summary(m13) # remove length
# m15 <-  clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score)*type + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m13, m15) # not significant, choose m15 (remove length) 
# summary(m15) # remove type:score
# m16 <-  clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m15, m16) # not significant, choose m16 (remove type:score) 
# summary(m16) # remove macron:maoriList
# m17 <-  clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron + c.(score) + (1 + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# anova(m16, m17) # significant, choose m16 (do not remove macron:maoriList) 
# summary(m16) # add a random slope (type) by workerId to m16
# m16_1 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score) + (1 + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# Model didn't converge -> add a random slope (macron) by workerId to m16
# m16_2 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron*c.(maoriList) + c.(score) + (1 + macron + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m16_2) # AIC: 70297.76 
# anova(m16, m16_2) -> choose m16_2 (model with a lower AIC score) 
# summary(m16_2) # remove macron:maoriList
# m18 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron + c.(score) + (1 + macron + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m18) # AIC: 70299.17
# anova(m16_2, m18) # not significant, choose m18 (remove macron:maoriList) -> add a random slope (type) by workerId to m18
# m18_1 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron + c.(score) + (1 + macron + type + c.(n.neighbors) + c.(score)| workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m18_1) -> model didn't converge, stay with m18 -> add an uncorrelated random effect (type) by workerId to m18
# m18_2 <- clmm(enteredResponse ~ c.(n.neighbors)*type*c.(maoriList) + macron + c.(score) + (1 + macron + c.(n.neighbors) + c.(score)| workerId) + (0 + type | workerId) + (1 + c.(maoriList)|word), data=dataExp1)
# summary(m18_2) # AIC: 68789.22
# anova(m18, m18_2) # choose m18_2 (model with a lower AIC score) 
# saveRDS(m18_2, file = "exp1MaoriList.rds")
mExp1MaoriList <- readRDS("./exp1MaoriList.rds")   
clm_table(mExp1MaoriList, caption="Model summary of confidence ratings with participants’ self-rated basic knowledge of Māori. All numeric variables in this model are centered.")
Model summary of confidence ratings with participants’ self-rated basic knowledge of Māori. All numeric variables in this model are centered.
Parameter                                                      Estimate  Std. Error  \(z\)   \(p\)
Effects
  n.neighbors (centered)                                       0.030     0.017       1.796   0.073 .
  type = real                                                  4.585     0.228       20.145  <0.001 ***
  maoriList (centered)                                         0.067     0.050       1.322   0.186
  macron = TRUE                                                0.757     0.181       4.183   <0.001 ***
  score (centered)                                             2.806     0.795       3.530   <0.001 ***
  n.neighbors (centered) × type = real                         -0.020    0.021       -0.939  0.348
  n.neighbors (centered) × maoriList (centered)                0.008     0.002       4.088   <0.001 ***
  type = real × maoriList (centered)                           0.388     0.068       5.697   <0.001 ***
  n.neighbors (centered) × type = real × maoriList (centered)  -0.009    0.002       -4.185  <0.001 ***
Thresholds
  1|2                                                          -3.084    0.163
  2|3                                                          -0.786    0.161
  3|4                                                          1.472     0.161
  4|5                                                          2.954     0.162
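The three-way interaction in the table above is easier to read as simple slopes. The sketch below combines the four n.neighbors coefficients into the slope of neighbourhood density (on the logit scale) for nonwords and real words at a few values of centered maoriList. The values -2, 0 and 2 are illustrative assumptions, and nn_slope is a hypothetical helper name.

```r
# Coefficients copied from the model summary above.
b_nn           <- 0.030   # n.neighbors
b_nn_real      <- -0.020  # n.neighbors x type = real
b_nn_list      <- 0.008   # n.neighbors x maoriList
b_nn_real_list <- -0.009  # n.neighbors x type = real x maoriList

# Hypothetical helper: slope of n.neighbors for a stimulus type
# (is_real = 0 or 1) and a centered maoriList value.
nn_slope <- function(is_real, list_c) {
  b_nn + b_nn_real * is_real + (b_nn_list + b_nn_real_list * is_real) * list_c
}

# Illustrative centered maoriList values (assumed, not taken from the data).
grid <- expand.grid(is_real = c(0, 1), list_c = c(-2, 0, 2))
grid$slope <- with(grid, nn_slope(is_real, list_c))
grid
```

For nonwords the slope grows with basic knowledge (0.030 + 0.008 per unit of maoriList), whereas for real words it stays close to zero (0.010 - 0.001 per unit).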

1.6.9 Effect plots from the model with participants’ basic knowledge of Māori

Effect plots of: the interaction between the neighbourhood density, self-rated basic knowledge of Māori and the distinction between non vs. real word stimuli (Fig.a); phonotactic score (Fig.b); and the presence of macrons (Fig.c). For plots involving interactions with participants’ basic knowledge of Māori, the upper panel represents participants with high level of knowledge, and the lower panel represents participants with low level of knowledge. Plots on the left show predicted mean ratings and plots on the right show predicted distributions over ratings.

2 Experiment 2

2.1 Participant exclusions

The participant filtering process of Experiment 2 is detailed in the code below.

# Loading the data to filter out participants
dataExp2 <- read.delim("./dataAnonNotFilteredExp2_added.txt", sep ="\t", header = TRUE, encoding="UTF-8")

# Remove 9 participants whose speakMaori or compMaori is equal to or above 3
rmParticipant1Exp2 <- unique(dataExp2[dataExp2$speakMaori >= 3 | dataExp2$compMaori >= 3,]$workerId) 
# 1b817b93 f44d2e0f 2d528dd7 90e5e8cd aa592fef a95bbe09 35ace421 b3f52c92 e7a6cb02
dataExp2 <- dataExp2[!dataExp2$workerId %in% rmParticipant1Exp2,]

# Remove one participant who did not learn English in NZ and has been living overseas for more than two years (duration == "long")
summaryExp2WorkerId <- unique(dataExp2[,c("workerId","firstLangCountry","place","duration")])
EngNotInNZExp2 <- summaryExp2WorkerId[!summaryExp2WorkerId$firstLangCountry=="NZ",]
rmParticipant2Exp2 <- unique(EngNotInNZExp2[EngNotInNZExp2$place=="overseas",]$workerId) 
# 1e48b18a
dataExp2 <- dataExp2[!dataExp2$workerId %in% rmParticipant2Exp2,]

# Detect participants whose median reactionTime is more than 2 SD below the mean of all participants' medians
median_RT <- aggregate(dataExp2$reactionTime, by=list(dataExp2$workerId), median)
names(median_RT) <- c("workerId","median")
cut <- mean(median_RT$median)-2*sd(median_RT$median)
# median_RT[!median_RT$median > cut,]$workerId # None detected!

# Remove a participant with joke answers
dataExp2 <- dataExp2[!dataExp2$workerId=="eaed6b4d",]

# Check the total number of usable participants for Exp2
# length(unique(dataExp2$workerId)) # 123

# Add a column to indicate the presence of macrons
dataExp2$macron <- FALSE
dataExp2[grepl("ā|ē|ī|ō|ū",dataExp2$word),]$macron <- TRUE
dataExp2$macron <- as.factor(dataExp2$macron)
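The reaction-time screen in the code above can be tried on a small toy data set. The sketch below uses made-up data (ten hypothetical participants with three trials each, the last of whom responds implausibly fast) and flags anyone whose median reaction time is more than 2 SD below the mean of the per-participant medians. With only a handful of participants this rule cannot flag anyone, because one value among n can lie at most (n - 1)/sqrt(n) SDs from the mean, which only exceeds 2 once n is at least 6.

```r
# Toy data (hypothetical): participants p01..p09 respond in roughly 3 s,
# p10 responds in roughly 0.2 s.
base_rt <- c(seq(2.6, 3.4, by = 0.1), 0.2)
toy <- data.frame(
  workerId     = rep(sprintf("p%02d", 1:10), each = 3),
  reactionTime = rep(base_rt, each = 3) + rep(c(-0.1, 0, 0.1), times = 10)
)

# Median reaction time per participant.
median_rt <- aggregate(reactionTime ~ workerId, data = toy, FUN = median)

# Cutoff: 2 SD below the mean of the per-participant medians.
rt_cutoff <- mean(median_rt$reactionTime) - 2 * sd(median_rt$reactionTime)

# Participants falling below the cutoff.
flagged <- median_rt$workerId[median_rt$reactionTime < rt_cutoff]
flagged
```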

2.2 Dataset structure

The data is structured as follows:

  • workerId is the unique ID for each participant.
  • definition is each participant’s entered definition for each stimulus.
  • coding is the marking for each definition.
  • correct is binary: either the definition is correct (TRUE) or incorrect (FALSE).
  • confidence is each participant’s confidence rating for each definition (on a scale from 0 to 5).
  • familiarity is the average rating for each Māori word obtained from 101 NMS New Zealanders in Experiment 1.
  • reactionTime is the reaction time for each rating (seconds).
  • length is the phoneme length of each stimulus.
  • word is the stimulus used for the rating.
  • speakMaori is each participant’s report of how well they can speak Māori (on a scale from 0 to 5).
  • compMaori is each participant’s report of how well they can understand/read Māori (on a scale from 0 to 5).
  • maoriProf is the sum of the quantified responses for speakMaori and compMaori (participant Māori proficiency).
  • age is the age group for each participant.
  • gender is the gender reported by each participant.
  • ethnicity is categorized into binary answers, either Māori (M) or non Māori (non M).
  • education is each participant’s highest level of education.
  • children is each participant’s report of whether they have had any children who have attended preschool or primary school in New Zealand in the past five years.
  • maoriList is each participant’s basic knowledge of Māori (with a scale ranging from 0 to 9).
  • place is each participant’s current place of living (3 levels: NZ North Island, NZ South Island, or Overseas).
  • duration is each participant’s time living in their current place (2 levels: long is > 2 years; short is <= 2 years).
  • firstLang is each participant’s first language.
  • firstLangCountry is the country where each participant learned their first language.
  • anyOtherLangs is any other languages each participant reports speaking.
  • hawaii is the binary response to the question whether participants have lived in Hawaii.
  • anyPolynesian is the binary response to the question whether participants know any Polynesian languages such as Hawaiian, Tahitian, Sāmoan, or Tongan.
  • whichPolynesian is the information regarding participants’ knowledge of Polynesian languages, if they know any.
  • impairments is the answer to the question whether participants have a history of any speech or language impairments.
  • maoriExpo is each participant’s level of exposure to Māori (with a scale ranging from 0 to 10).
  • score is the phonotactic score based on the 1,629 most frequent morph types derived from all words in the dictionary, normalized by the phonemic length of stimuli, ignoring vowel length distinctions, assuming that participants are attempting to parse stimuli into morphs.
  • n.neighbors is the number of words (from the Māori dictionary) that can be reached by adding, deleting, or substituting one phoneme in each stimulus.
  • mean.neighbor.logfreq is the frequency-weighted phonological neighbourhood density.

2.3 Overview of participants’ sociolinguistic profile in Experiment 2

Overview of participants’ sociolinguistic profile in Experiment 2. Bars are labeled with their counts for each category.

2.4 Rate of accurate definitions per word

Rate of accurate definitions per word. Words are displayed according to their phonotactic score on the x-axis and their accuracy rates are represented on the y-axis. Overlapping labels are not shown.

2.5 Statistical analyses

2.5.1 Comparing the AIC scores of three statistical models with different measures for evaluating participants

Comparison of the AIC score (Experiment 2)
Evaluation measure  AIC
Basic knowledge     12327.5
Proficiency         12509.0
Exposure            12513.5
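The comparison above can be expressed as AIC differences relative to the best-scoring model. The sketch below simply re-enters the three reported values. It assumes that all three models were fitted to the same observations, which AIC comparisons require.

```r
# AIC values copied from the table above.
aic <- c("Basic knowledge" = 12327.5, "Proficiency" = 12509.0, "Exposure" = 12513.5)

# Difference from the lowest AIC (lower is better).
delta_aic <- aic - min(aic)
sort(delta_aic)
```

The model with basic knowledge has the lowest AIC, with the proficiency and exposure models 181.5 and 186.0 points higher.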

2.5.2 Modeling accuracy with participants’ basic knowledge of Māori by a mixed-effects binary logistic regression model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects & mixed-effects
# m1 <- glm(correct ~ (c.(score) + c.(n.neighbors) + c.(familiarity) + c.(length) + macron)*c.(maoriList), data=dataExp2, family=binomial(link="logit"))
# summary(m1) # remove score:maoriList
# m2 <- update(m1, . ~ . -c.(score):c.(maoriList))
# anova(m1, m2, test="Chisq") # not significant, choose m2 (remove score:maoriList)
# summary(m2) # remove macron:maoriList
# m3 <- update(m2, . ~ . -macron:c.(maoriList))
# anova(m2, m3, test="Chisq") # not significant, choose m3 (remove macron:maoriList)
# summary(m3) # remove length:maoriList
# m4 <- update(m3, . ~ . -c.(length):c.(maoriList)) 
# anova(m3, m4, test="Chisq") # not significant, choose m4 (remove length:maoriList)
# summary(m4) # remove score
# m5 <- update(m4, . ~ . -c.(score))
# anova(m4, m5, test="Chisq") # not significant, choose m5 (remove score)
# summary(m5) # remove n.neighbors:maoriList
# m6 <- update(m5, . ~ . -c.(n.neighbors):c.(maoriList))
# anova(m5, m6, test="Chisq") # significant, choose m5 (do not remove n.neighbors:maoriList)
# summary(m5) # remove familiarity:maoriList
# m7 <- update(m5, . ~ . -c.(familiarity):c.(maoriList))
# anova(m5, m7, test="Chisq") # significant, choose m5 (do not remove familiarity:maoriList)
# m8 <- glmer(correct ~ c.(n.neighbors)*c.(maoriList) + c.(familiarity)*c.(maoriList) + c.(length) + macron + (1 + c.(length) + c.(n.neighbors) + c.(familiarity) + macron|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# summary(m8) # singular fit, remove a random slope (length) by workerId 
# m9 <- glmer(correct ~ c.(n.neighbors)*c.(maoriList) + c.(familiarity)*c.(maoriList) + c.(length) + macron + (1 + c.(n.neighbors) + c.(familiarity) + macron|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# summary(m9) # singular fit, remove random slopes (length and n.neighbors) by workerId 
# m10 <- glmer(correct ~ c.(n.neighbors)*c.(maoriList) + c.(familiarity)*c.(maoriList) + c.(length) + macron + (1 + macron + c.(familiarity)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# summary(m10)  # remove length
# m11 <- glmer(correct ~ c.(n.neighbors)*c.(maoriList) + c.(familiarity)*c.(maoriList) + macron + (1 + macron + c.(familiarity)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# anova(m10, m11) # not significant, choose m11 (remove length)
# summary(m11) # remove n.neighbors:maoriList
# m12 <- glmer(correct ~ c.(n.neighbors) + c.(familiarity)*c.(maoriList) + macron + (1 + macron + c.(familiarity)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# anova(m11, m12) # not significant, choose m12 (remove n.neighbors:maoriList)
# summary(m12) # remove n.neighbors
# m13 <- glmer(correct ~ c.(familiarity)*c.(maoriList) + macron + (1 + macron + c.(familiarity)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# anova(m12, m13) # not significant, choose m13 (remove n.neighbors)
# summary(m13) # AIC: 12327.53, remove familiarity:maoriList
# m14 <- glmer(correct ~ c.(familiarity) + c.(maoriList) + macron + (1 + macron + c.(familiarity)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# anova(m13, m14) # significant, choose m13 (do not remove familiarity:maoriList) -> add an uncorrelated random effect (n.neighbors) by workerId to m13
# m13_1 <- glmer(correct ~ c.(familiarity)*c.(maoriList) + macron + (1 + macron + c.(familiarity)|workerId) + (0 + c.(n.neighbors)|workerId) + (1+ c.(maoriList)|word), family = binomial(link = "logit"), data = dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)))
# summary(m13_1) # AIC: 12329.52
# anova(m13, m13_1) # choose m13 (model with a lower AIC score)
# saveRDS(m13, file = "exp2MaoriList.rds")
mExp2MaoriList <- readRDS("./exp2MaoriList.rds")
logistic_table(mExp2MaoriList, caption="Model summary of accuracy with participants’ self-rated basic knowledge of Māori. All numeric variables in this model are centered.")
Model summary of accuracy with participants’ self-rated basic knowledge of Māori. All numeric variables in this model are centered.
Parameter                                      Estimate  Std. Error  \(z\)   \(p\)
(Intercept)                                    0.949     0.166       5.728   <0.001 ***
familiarity (centered)                         5.129     0.469       10.932  <0.001 ***
maoriList (centered)                           0.459     0.043       10.688  <0.001 ***
macron = TRUE                                  -0.850    0.292       -2.909  0.004 **
familiarity (centered) × maoriList (centered)  -0.208    0.086       -2.428  0.015 *
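Because the coefficients above are on the logit scale, it can help to translate them into probabilities. The sketch below does this for the intercept and the macron effect only. It assumes an average participant and word (random effects set to zero) and centered covariates held at zero, so the numbers are conditional predictions rather than population averages.

```r
# Coefficients copied from the model summary above.
b0       <- 0.949   # intercept: macron = FALSE, centered covariates at 0
b_macron <- -0.850  # macron = TRUE

# Predicted probability of a correct definition via the inverse logit.
p_no_macron <- plogis(b0)
p_macron    <- plogis(b0 + b_macron)

# Odds ratio associated with the presence of a macron.
odds_ratio_macron <- exp(b_macron)

round(c(no_macron = p_no_macron, macron = p_macron, odds_ratio = odds_ratio_macron), 3)
```

Under these assumptions the predicted accuracy drops from about 0.72 to about 0.52 when the word contains a macron.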

2.5.3 Effect plots from the model with participants’ basic knowledge of Māori

Effect plots of: the interaction between familiarity and participants’ basic knowledge of Māori (Fig.a); and the presence of macrons (Fig.b).

2.5.4 Modeling accuracy with participants’ self-rated Māori proficiency by a mixed-effects binary logistic regression model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects and mixed-effects
# m1 <- glm(correct ~ (c.(score) + c.(n.neighbors) + c.(familiarity) + c.(length) + macron)*c.(maoriProf), data=dataExp2, family=binomial(link="logit"))
# summary(m1) # remove score:maoriProf
# m2 <- update(m1, . ~ . -c.(score):c.(maoriProf))
# anova(m1, m2, test="Chisq") # not significant, choose m2 (remove score:maoriProf) 
# summary(m2) # remove macron:maoriProf
# m3 <- update(m2, . ~ . -macron:c.(maoriProf))
# anova(m2, m3, test="Chisq") # not significant, choose m3 (remove macron:maoriProf) 
# summary(m3) # remove length:maoriProf
# m4 <- update(m3, . ~ . -c.(length):c.(maoriProf))
# anova(m3, m4, test="Chisq") -> not significant, choose m4 
# summary(m4) # remove n.neighbors:maoriProf
# m5 <- update(m4, . ~ . -c.(n.neighbors):c.(maoriProf))
# anova(m4, m5, test="Chisq") # not significant, choose m5 
# summary(m5) # remove score
# m6 <- update(m5, . ~ . -c.(score))
# anova(m5, m6, test="Chisq") # not significant, choose m6
# m7 <- glmer(correct ~ c.(familiarity)*c.(maoriProf) + c.(n.neighbors) + c.(length) + macron + (1 + c.(familiarity) + c.(n.neighbors) + c.(length) + macron|workerId) + (1+ c.(maoriProf)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# summary(m7) # remove length
# m8 <- glmer(correct ~ c.(familiarity)*c.(maoriProf) + c.(n.neighbors) + macron + (1 + c.(familiarity) + c.(n.neighbors) + macron|workerId) + (1+ c.(maoriProf)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m7, m8) # not significant, choose m8 (remove length)
# summary(m8) # remove n.neighbors
# m9 <- glmer(correct ~ c.(familiarity)*c.(maoriProf) + macron + (1 + c.(familiarity) + macron|workerId) + (1+ c.(maoriProf)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m8, m9) # not significant, choose m9 (remove n.neighbors) -> remove familiarity:maoriProf
# m10 <- glmer(correct ~ c.(familiarity) + c.(maoriProf) + macron + (1 + c.(familiarity) + macron|workerId) + (1+ c.(maoriProf)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m9, m10) # significant, choose m9 (do not remove familiarity:maoriProf)
# saveRDS(m9, file = "exp2MaoriProf.rds")
mExp2MaoriProf <- readRDS("./exp2MaoriProf.rds")
logistic_table(mExp2MaoriProf, caption="Model summary of accuracy with participants’ self-rated Māori proficiency. All numeric variables in this model are centered.")
Model summary of accuracy with participants’ self-rated Māori proficiency. All numeric variables in this model are centered.
Parameter                                      Estimate  Std. Error  \(z\)   \(p\)
(Intercept)                                    0.864     0.176       4.901   <0.001 ***
familiarity (centered)                         4.926     0.448       10.990  <0.001 ***
maoriProf (centered)                           0.725     0.125       5.807   <0.001 ***
macron = TRUE                                  -0.742    0.296       -2.507  0.012 *
familiarity (centered) × maoriProf (centered)  -0.548    0.172       -3.185  0.001 **
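The negative interaction in the table above means that the familiarity effect weakens as self-rated proficiency increases. A minimal sketch of the implied familiarity slopes (logit scale) follows. The centered maoriProf values of -1, 0 and 1 are illustrative assumptions and are not taken from the data.

```r
# Coefficients copied from the model summary above.
b_fam      <- 4.926   # familiarity
b_fam_prof <- -0.548  # familiarity x maoriProf

# Illustrative centered maoriProf values (assumed).
prof_c <- c(low = -1, mean = 0, high = 1)

# Slope of familiarity at each proficiency value.
fam_slope <- b_fam + b_fam_prof * prof_c
round(fam_slope, 3)
```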

2.5.5 Effect plots from the model with participants’ self-rated Māori proficiency

Effect plots of: the interaction between familiarity and self-rated Māori proficiency (Fig.a); and the presence of macrons (Fig.b).

2.5.6 Modeling accuracy with participants’ self-reported exposure to Māori by a mixed-effects binary logistic regression model

The model fitting procedure is described in the code below.

# Model fitting: fixed-effects and mixed-effects
# m1 <- glm(correct ~ (c.(score) + c.(n.neighbors) + c.(familiarity) + c.(length) + macron)*c.(maoriExpo), data=dataExp2, family=binomial(link="logit"))
# summary(m1) # remove length:maoriExpo
# m2 <- update(m1, . ~ . -c.(length):c.(maoriExpo))
# anova(m1, m2, test="Chisq") # not significant, choose m2 (remove length:maoriExpo) 
# summary(m2) # remove macron:maoriExpo
# m3 <- update(m2, . ~ . -macron:c.(maoriExpo))
# anova(m2, m3, test="Chisq") # not significant, choose m3 (remove macron:maoriExpo) 
# summary(m3) # remove score:maoriExpo
# m4 <- update(m3, . ~ . -c.(score):c.(maoriExpo))
# anova(m3, m4, test="Chisq") # not significant, choose m4 (remove score:maoriExpo) 
# summary(m4) # remove score
# m5 <- update(m4, . ~ . -c.(score))
# anova(m4, m5, test="Chisq") # not significant, choose m5 (remove score) 
# summary(m5) # remove familiarity:maoriExpo
# m6 <- update(m5, . ~ . -c.(familiarity):c.(maoriExpo)) 
# anova(m5, m6, test="Chisq") # significant, choose m5 (do not remove familiarity:maoriExpo) 
# summary(m5) # remove n.neighbors:maoriExpo
# m7 <- update(m5, . ~ . -c.(n.neighbors):c.(maoriExpo))
# anova(m5, m7, test="Chisq") # significant, choose m5 (do not remove n.neighbors:maoriExpo)
# m8 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity)*c.(maoriExpo) + macron + c.(length) + (1 + c.(familiarity) + c.(n.neighbors) + macron + c.(length)|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# summary(m8) # singular fit -> remove a random slope (length) by workerId
# m9 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity)*c.(maoriExpo) + macron + c.(length) + (1 + c.(familiarity) + c.(n.neighbors) + macron|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# summary(m9) # remove maoriExpo:familiarity
# m10 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity) + macron + c.(length) + (1 + c.(familiarity) + c.(n.neighbors) + macron|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m9, m10) # not significant, choose m10 (remove maoriExpo:familiarity)
# summary(m10) # remove length
# m11 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity) + macron + (1 + c.(familiarity) + c.(n.neighbors) + macron|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m10, m11) # not significant, choose m11 (remove length)
# summary(m11) # singular fit -> remove a random slope (n.neighbors) by workerId
# m12 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity) + macron + (1 + c.(familiarity) + macron|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# summary(m12) # AIC: 12513.45, remove n.neighbors:maoriExpo
# m13 <- glmer(correct ~ c.(n.neighbors) + c.(maoriExpo) + c.(familiarity) + macron + (1 + c.(familiarity) + macron|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# anova(m12, m13) # significant, choose m12 (do not remove n.neighbors:maoriExpo) -> add an uncorrelated random effect (n.neighbors) by workerId to m12
# m12_1 <- glmer(correct ~ c.(n.neighbors)*c.(maoriExpo) + c.(familiarity) + macron + (1 + c.(familiarity) + macron|workerId)  + (0 + c.(n.neighbors)|workerId) + (1+ c.(maoriExpo)|word), data=dataExp2, control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1e6)), family=binomial(link="logit"))
# summary(m12_1) #AIC: 12515.45
# anova(m12, m12_1) # choose m12 (model with a lower AIC score)
# saveRDS(m12, file = "exp2MaoriExpo.rds")
mExp2MaoriExpo <- readRDS("./exp2MaoriExpo.rds")   
logistic_table(mExp2MaoriExpo, caption="Model summary of accuracy with participants’ self-reported exposure to Māori. All numeric variables in this model are centered.")
Model summary of accuracy with participants’ self-reported exposure to Māori. All numeric variables in this model are centered.
| Parameter | Estimate | Std. Error | \(z\) | \(p\) |
|---|---|---|---|---|
| (Intercept) | 0.914 | 0.173 | 5.280 | <0.001 *** |
| n.neighbors (centered) | -0.003 | 0.014 | -0.198 | 0.843 |
| maoriExpo (centered) | 0.384 | 0.053 | 7.278 | <0.001 *** |
| familiarity (centered) | 4.941 | 0.435 | 11.357 | <0.001 *** |
| macron = TRUE | -0.890 | 0.295 | -3.020 | 0.003 ** |
| n.neighbors (centered) × maoriExpo (centered) | -0.004 | 0.002 | -2.391 | 0.017 * |
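
The estimates in this table are on the log-odds scale. As a rough reading aid, the sketch below converts them to odds ratios and to baseline probabilities. The coefficient values are copied from the table; the names `est`, `odds_ratio`, `p_baseline` and `p_macron` are introduced here for illustration only and are not part of the original script. The probabilities ignore the random effects, so they describe a typical participant and word rather than a population average.

```r
# Fixed-effect estimates copied from the model summary table (log-odds scale).
est <- c(
  intercept   = 0.914,
  n.neighbors = -0.003,
  maoriExpo   = 0.384,
  familiarity = 4.941,
  macron      = -0.890,
  interaction = -0.004
)

# Odds ratios: multiplicative change in the odds of a correct response
# per one-unit increase in the (centered) predictor.
odds_ratio <- exp(est)

# Predicted probability of a correct response for a word without a macron
# when every centered numeric predictor is at its mean (i.e. at 0).
p_baseline <- plogis(est[["intercept"]])

# Same baseline, but for a word with a macron.
p_macron <- plogis(est[["intercept"]] + est[["macron"]])

print(round(odds_ratio, 3))
print(round(c(no_macron = p_baseline, macron = p_macron), 3))
```

Under these assumptions the baseline probability of a correct response is about 0.71 for a word without a macron and about 0.51 for a word with one.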

2.5.7 Effect plots from the model with participants’ self-rated exposure to Māori

Effect plots of: the interaction between neighbourhood density and participants’ exposure to Māori (Fig.a); the presence of macrons (Fig.b); and familiarity (Fig.c).
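
To make the interaction in Fig.a concrete, the following sketch computes fixed-effect predictions over a small grid of neighbourhood density values at two exposure levels. Only the coefficients come from the model summary table. The grid values (`nn_centered`, `expo_centered`) are hypothetical, since the observed ranges of the centered predictors are not shown here. The calculation ignores random effects and assumes a word without a macron at mean familiarity.

```r
# Coefficients copied from the model summary table (log-odds scale).
b0     <- 0.914   # intercept
b_nn   <- -0.003  # n.neighbors (centered)
b_expo <- 0.384   # maoriExpo (centered)
b_int  <- -0.004  # n.neighbors x maoriExpo interaction

# Hypothetical grid of centered predictor values (assumed for illustration).
grid <- expand.grid(
  nn_centered   = c(-20, 0, 20),
  expo_centered = c(-2, 2)
)

# Linear predictor and predicted probability of a correct response.
grid$eta <- b0 + b_nn * grid$nn_centered + b_expo * grid$expo_centered +
  b_int * grid$nn_centered * grid$expo_centered
grid$prob <- plogis(grid$eta)

# Slope of neighbourhood density (log-odds per neighbour) at each
# assumed exposure level.
nn_slope <- b_nn + b_int * c(low = -2, high = 2)

print(grid)
print(nn_slope)
```

With these assumed values the density slope is slightly positive at low exposure (0.005) and negative at high exposure (-0.011), which is the pattern a negative interaction coefficient implies.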

3 Supporting Information

3.1 Appendix A: Post-questionnaire

  1. How well are you able to speak Māori?
    \(\square\) Very well (I can talk about almost anything in Māori)
    \(\square\) Well (I can talk about many things in Māori)
    \(\square\) Fairly well (I can talk about some things in Māori)
    \(\square\) Not very well (I can only talk about simple/basic things in Māori)
    \(\square\) No more than a few words or phrases
    \(\square\) Not at all

  2. How well are you able to understand/read Māori?
    \(\square\) Very well (I can understand almost anything said/written in Māori)
    \(\square\) Well (I can understand many things said/written in Māori)
    \(\square\) Fairly well (I can understand some things said/written in Māori)
    \(\square\) Not very well (I can only understand simple/basic things said/written in Māori)
    \(\square\) No more than a few words or phrases
    \(\square\) Not at all

  3. Which age group do you belong to?
    \(\square\) 18 - 29
    \(\square\) 30 - 39
    \(\square\) 40 - 49
    \(\square\) 50 - 59
    \(\square\) +60

  4. Please state your gender:

  5. Please state your ethnicity:

  6. Your highest education is:
    \(\square\) High school
    \(\square\) Undergraduate degree
    \(\square\) Graduate degree

  7. How often do you think you are exposed to the Māori language in your daily life, by means of Māori radio, Māori TV, online media?
    \(\square\) Less than once a year
    \(\square\) Less than once a month
    \(\square\) Less than once a week
    \(\square\) Less than once a day
    \(\square\) Multiple times a day

  8. How often do you think you are exposed to Māori language in your daily life, in conversation at work, at home, in social settings?
    \(\square\) Less than once a year
    \(\square\) Less than once a month
    \(\square\) Less than once a week
    \(\square\) Less than once a day
    \(\square\) Multiple times a day

  9. In the past five years, have you had any children living with you who have attended preschool or primary school in New Zealand?
    \(\square\) Yes
    \(\square\) No

  10. Please tick all boxes that apply.
    \(\square\) I can give a mihi in Māori.
    \(\square\) I can sing a few songs in Māori.
    \(\square\) I can sing the NZ national anthem in Māori.
    \(\square\) I know how to say some basic phrases (e.g. My name is…, I’m from…) in Māori.
    \(\square\) I know how to say some commands (e.g. Sit down / Come here) in Māori.
    \(\square\) I know how to say some greetings in Māori.
    \(\square\) I know how to say some numbers in Māori.
    \(\square\) I know how to say some body parts in Māori.
    \(\square\) I know how to say some colors in Māori.

  11. What region of New Zealand do you live in currently? (Please choose “overseas” if you are living outside of New Zealand.)
    \(\square\) Northland
    \(\square\) Auckland
    \(\square\) Waikato
    \(\square\) Bay of Plenty
    \(\square\) Gisborne
    \(\square\) Hawke’s Bay
    \(\square\) Taranaki
    \(\square\) Wanganui
    \(\square\) Manawatu
    \(\square\) Wairarapa
    \(\square\) Wellington
    \(\square\) Nelson Bays
    \(\square\) Marlborough
    \(\square\) West Coast
    \(\square\) Canterbury
    \(\square\) Timaru - Oamaru
    \(\square\) Otago
    \(\square\) Southland
    \(\square\) Overseas

  12. How long have you been living there?

  13. Please state your first language (the language you speak/use most of your time).

  14. What country were you living in when you first learned this language?

  15. Please list any other languages that you can speak fluently:

  16. Have you ever lived in Hawaii?
    \(\square\) Yes
    \(\square\) No

  17. Do you speak/understand any Polynesian languages such as Hawaiian, Tahitian, Sāmoan, or Tongan?
    \(\square\) Yes
    \(\square\) No

  18. If you replied yes to question 17, please state the language you know.

  19. Do you have a history of any speech or language impairments that you are aware of?
    \(\square\) Yes
    \(\square\) No

3.2 Appendix B: Stimulus materials for Experiments

3.2.1 List of stimuli for Experiment 1 - real words

ako, aoraki, aotearoa, aroha, atua, awa, haere mai, haka, hangi, hapū, hīkoi, hōhā, hoki, hongi, hope, hui, ihu, iti, iwa, iwi, kaha, kahurangi, kai, kai moana, kāinga, kaitiaki, kaiwhakahaere, kākāriki, kapa haka, karakia, karanga, karu, katoa, kaumātua, kaupapa, kāwanatanga, kia kaha, kia ora, koha, kōhanga, kōrero, koro, korowai, koru, kōwhai, kuia, kura, kurī, mahi, mana, manuhiri, māori, marae, matariki, mate, mauī, maunga, māwhero, mere, mihi, moana, moko, mokopuna, mōrena, motu, noho, nui, ono, ora, pai, pākehā, pango, papa, papatūānuku, poi, pokohiwi, pounamu, pōwhiri, puke, puku, rangatira, rangatiratanga, rangi, ranginui, reo, rima, ringaringa, roto, rua, tahi, taiaha, taihoa, tamariki, tāne, tāngata, tangata whenua, tangi, taniwha, taonga, tapu, taringa, teina, tekau, tēnā koe, tēnā koutou, tikanga, tiki, tohunga, toru, tuakana, tupuna, turituri, upoko, utu, waewae, waha, wāhi tapu, wahine, wai, waiata, waka, wānanga, waru, whaea, whakapapa, whakarongo, whanau, whāngai, whare, whenua, whero, whitu

3.2.2 List of stimuli for Experiment 1 - nonwords

ahatiati, ahiahake, amu, ape, apēhia, arane, ario, eha, eko, haeo, hakaatū, hāno, hepaua, hepiti, hewe, hiamu, hingi, hiu, hoengaima, hoihoko, hōke, horetī, howaka, huengi, hūku, humo, hunge, hupū, ihiri, ikau, iko, inga, ingi, iniata, ino, ire, iru, kawaa, kāweroni, kawha, kawha kawha, kawha nia, kawha whani, kemoramo, kepi, kingiro, kitō, kōioromāpara, komekua, kōmuawhiu, kōua, kūhatapō, kupō, kūro, kūwhati, māheneketoa, makei, mamatōhī, mango, māorawau, māorua, mautāmu, māwi, meahua, mero, mie, mihea, mini, moapi, moeo, mōha, mōnga, mungi, mupati, naipu, nānga, natoi, neetia, nema, ngae, ngaena, ngapoto, ngawhāniti, ngehi, ngema, ngemetata, ngepa, ngoa, ngue, nguta, nia, nia ire, nia kawha, nia pukau, nia uti, nia whani, nia whihia, nito, nitumaotaha, nōitia, nopo, nue, nuhi, nure, nuti, pahapā, pāhāpāko, paihoui, pāuki, paurounu, pāwhi, peu, pewe, pie, pīhu, pikeko, pīngi, poraki, poraki pukau, pote, pukau, pukau nia, pume, pūno, puora, pūrawha, pūtio, pūwhi, rahue, rangu, rapeia, rapuko, reru, rowa, rowhaohi, rukō, rume, rumo, rūne, rungu, rupa, rupo, taetū, tāhuma, takamīa, tākapī, tāmarutō, taongirua, tapopa, tārorangī, tatūhe, tawhengawhi, teaori, temi, tetohe, tetoua, teu, tīahu, tie, tikaweneri, tīkīhiki, tikōha, tikū, tīpe, tīpo, tītā, titapa, titapa pukau, tiwhi, tohiāhia, toketi, touki, tuanapū, tūkeiati, tumeiroruare, tuwhe, uke, uko, unati, uro, uti, waemura, wawemiti, wehao, wereu, whaha, whāhu, whaiē, whakōiaweahua, whāngaki, whani, whani kawha, whani nia, whani poraki, whani titapa, whani whani, whataī, whehu, whenepōna, whengo, wheto, wheu, whihia, whihia nia, whuri, whutarirari, wikuruta, wura, wuri

3.2.3 List of stimuli for Experiment 2

aoraki, aroha, atua, awa, haere mai, haka, hangi, hapū, hīkoi, hōhā, hoki, hongi, hui, iti, iwa, iwi, kaha, kahurangi, kai, kai moana, kāinga, kaitiaki, kākāriki, kapa haka, karakia, karanga, katoa, kaumātua, kaupapa, kāwanatanga, kia kaha, kia ora, koha, kōhanga, kōrero, koro, korowai, koru, kōwhai, kuia, kura, kurī, mahi, mana, manuhiri, marae, matariki, mauī, maunga, māwhero, mere, mihi, moana, moko, mokopuna, mōrena, motu, noho, nui, ono, ora, pai, papa, papatūānuku, poi, pounamu, pōwhiri, puke, puku, rangatira, rangatiratanga, rangi, ranginui, reo, rima, ringaringa, roto, rua, tahi, taiaha, taihoa, tamariki, tāne, tāngata, tangata whenua, tangi, taniwha, taonga, tapu, taringa, teina, tekau, tēnā koe, tēnā koutou, tikanga, tiki, tohunga, toru, tuakana, tupuna, utu, waewae, wāhi tapu, wahine, wai, waiata, waka, wānanga, waru, whaea, whakapapa, whakarongo, whanau, whāngai, whare, whenua, whero, whitu

4 Paper figures

4.1 Figure 1

Figure 1: Overview of participants’ sociolinguistic profile in Experiment 1. Bars are labeled with their counts for each category.

4.2 Figure 2

Figure 2: Overview of participants’ sociolinguistic profile in Experiment 2. Bars are labeled with their counts for each category.

4.3 Figure 3

Figure 3: Length distribution of real word stimuli. The length of stimulus (the number of phonemes) is represented on the x-axis and the number of stimuli is represented on the y-axis.

4.4 Figure 4

Figure 4: Average word ratings by phonotactic score. The average rating per word is represented on the y-axis and the phonotactic score is represented on the x-axis. Overlapping labels are omitted.

4.5 Figure 5

Figure 5: Effect plots of: the interaction between neighbourhood density, stimulus type (nonword vs. real word), and participant d’ (Fig.a); the interaction between stimulus length and participant d’ (Fig.b); the interaction between the presence of macrons and participant d’ (Fig.c); and the phonotactic score (Fig.d). For plots involving interactions with participant d’, the upper panel represents participants with high d’, and the lower panel represents participants with low d’. Plots on the left show predicted mean ratings and plots on the right show predicted distributions over ratings.

4.6 Figure 6

Figure 6: Rate of accurate definitions per word. Words are displayed according to their phonotactic score on the x-axis and their accuracy rates are represented on the y-axis. Overlapping labels are not shown.

4.7 Figure 7

Figure 7: Effect plots of: the interaction between familiarity and participants’ basic knowledge of Māori (Fig.a); and the presence of macrons (Fig.b).